{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# 第一题：支持向量机的核函数\n",
    "\n",
    "实验内容：\n",
    "\n",
    "1. 了解核函数对SVM的影响\n",
    "2. 绘制不同核函数的决策函数图像\n",
    "3. 简述引入核函数的目的"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## 1. 导入模型"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "import numpy as np\n",
    "import matplotlib.pyplot as plt\n",
    "%matplotlib inline\n",
    "from matplotlib.colors import ListedColormap\n",
    "\n",
    "import warnings\n",
    "warnings.filterwarnings('ignore')"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# 引入数据集\n",
    "from sklearn.datasets import make_moons"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# 引入数据预处理工具\n",
    "from sklearn.model_selection import train_test_split\n",
    "from sklearn.preprocessing import StandardScaler"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# 引入支持向量机分类器\n",
    "from sklearn.svm import SVC"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## 2. 生成数据"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "X, y = make_moons(n_samples = 100, noise = 0.3, random_state = 0)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "plt.figure(figsize = (8, 8))\n",
    "cm_bright = ListedColormap(['#FF0000', '#0000FF'])\n",
    "plt.scatter(X[:, 0], X[:, 1], c = y, cmap = cm_bright, edgecolors = 'k')"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## 3. 数据预处理与划分"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "X = StandardScaler().fit_transform(X)\n",
    "X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=.4, random_state=42)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "训练集和测试集可视化"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "cm = plt.cm.RdBu\n",
    "cm_bright = ListedColormap(['#FF0000', '#0000FF'])\n",
    "plt.figure(figsize = (8, 8))\n",
    "plt.title(\"Input data\")\n",
    "\n",
    "# Plot the training points\n",
    "plt.scatter(X_train[:, 0], X_train[:, 1], c=y_train, cmap=cm_bright, edgecolors='k')\n",
    "\n",
    "# and testing points\n",
    "plt.scatter(X_test[:, 0], X_test[:, 1], c=y_test, cmap=cm_bright, alpha=0.5, edgecolors='k')"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## 4. 样本到分离超平面距离的可视化"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "我们使用SVM模型里面的decision_function方法，可以获得样本到分离超平面的距离。\n",
    "\n",
    "下面的函数将图的背景变成数据点，计算每个数据点到分离超平面的距离，映射到不同深浅的颜色上，绘制出了不同颜色的背景。"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "def plot_model(model, title):\n",
    "    \n",
    "    # 训练模型，计算精度\n",
    "    model.fit(X_train, y_train)\n",
    "    score_train = model.score(X_train, y_train)\n",
    "    score_test = model.score(X_test, y_test)\n",
    "    \n",
    "    # 将背景网格化\n",
    "    h = 0.02\n",
    "    x_min, x_max = X[:, 0].min() - .5, X[:, 0].max() + .5\n",
    "    y_min, y_max = X[:, 1].min() - .5, X[:, 1].max() + .5\n",
    "    xx, yy = np.meshgrid(np.arange(x_min, x_max, h),\n",
    "                         np.arange(y_min, y_max, h))\n",
    "    \n",
    "    # 计算每个点到分离超平面的距离\n",
    "    Z = model.decision_function(np.c_[xx.ravel(), yy.ravel()]).reshape(xx.shape)\n",
    "    \n",
    "    # 设置图的大小\n",
    "    plt.figure(figsize = (14, 6))\n",
    "    \n",
    "    # 绘制训练集的子图\n",
    "    plt.subplot(121)\n",
    "    \n",
    "    # 绘制决策边界\n",
    "    plt.contourf(xx, yy, Z, cmap = cm, alpha=.8)\n",
    "\n",
    "    # 绘制训练集的样本\n",
    "    plt.scatter(X_train[:, 0], X_train[:, 1], c=y_train, cmap=cm_bright, edgecolors='k')\n",
    "    \n",
    "    # 设置图的上下左右界\n",
    "    plt.xlim(xx.min(), xx.max())\n",
    "    plt.ylim(yy.min(), yy.max())\n",
    "    \n",
    "    # 设置子图标题\n",
    "    plt.title(\"training set\")\n",
    "    \n",
    "    # 图的右下角写出在当前数据集中的精度\n",
    "    plt.text(xx.max() - .3, yy.min() + .3, ('acc: %.3f' % score_train).lstrip('0'), size=15, horizontalalignment='right')\n",
    "    \n",
    "    plt.subplot(122)\n",
    "    plt.contourf(xx, yy, Z, cmap = cm, alpha=.8)\n",
    "    \n",
    "    plt.scatter(X_test[:, 0], X_test[:, 1], c=y_test, cmap=cm_bright, edgecolors='k', alpha=0.6)\n",
    "    \n",
    "    plt.xlim(xx.min(), xx.max())\n",
    "    plt.ylim(yy.min(), yy.max())\n",
    "    \n",
    "    plt.title(\"testing set\")\n",
    "\n",
    "    plt.text(xx.max() - .3, yy.min() + .3, ('acc: %.3f' % score_test).lstrip('0'), size=15, horizontalalignment='right')\n",
    "\n",
    "    plt.suptitle(title)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "我们尝试绘制线性核的分类效果图"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "plot_model(SVC(kernel = \"linear\", probability = True), 'Linear SVM')"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# 作业1：请你尝试使用其他的核函数，绘制分类效果图"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## 1. 高斯核，又称径向基函数(Radial Basis Function)，指定kernel为\"rbf\"即可"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# YOUR CODE HERE\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## 2. sigmoid核，指定kernel为\"sigmoid\"即可"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# YOUR CODE HERE\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## 3. 多项式核，指定kernel为\"poly\"即可"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# YOUR CODE HERE\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# 作业2：简述为什么要引入核函数？"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": []
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.7.6"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 2
}
